ABSTRACT

Graphical user interfaces (GUIs) encode, as event sequences, 
potentially unbounded ways to interact with software. Dur- 
ing testing it becomes necessary to effectively sample the 
GUI's event space. Ideally, for increasing the efficiency and 
effectiveness of GUI testing, one would like to sample the 
GUI's event space by only generating sequences that (1) 
are allowed by the GUI's structure, and (2) chain together 
only those events that have data dependencies between their 
event handlers. We propose a new model, called an event- 
dependency graph (EDG) of the GUI that captures data 
dependencies between the code of event handlers. We de- 
velop a mapping between an EDG and an existing black-box 
model of the GUI's structure, called an event-flow graph 
(EFG). We automate the EDG construction in a tool that 
analyzes the bytecode of each event handler. We evaluate 
our "grey-box" approach using four open-source applications 
and compare it with the EFG approach. Our results show 
that using the EDG reduces the number of event sequences 
with respect to the EFG, while still achieving at least the 
same coverage. Furthermore, we are able to detect 2 new 
bugs in the subject applications. 
1. INTRODUCTION

A particular challenge for system testing of software appli- 
cations that have a graphical user interface (GUI) front-end 
is that the total number of all possible sequences of user 
actions is prohibitively large (in principle, possibly infinite), 
even for relatively small applications. A reasonably sized 
and effective sample needs to be selected for testing. GUI
testing, i.e., system testing of the software through its GUI,
is important because most of today's software applications
provide services to end-users via a GUI.

Each user interaction, e.g., pressing a key on the keyboard 
or clicking a mouse button, triggers an event in the appli- 
cation. An application responds to an event by executing a 
piece of code called the event handler associated with the 
event. In GUI testing, a sequence of events is an integral 
part of a GUI test case. In particular, a GUI test case
consists of (1) a precondition that must hold before executing a
sequence of events; (2) the actual sequence of events to be 
executed; (3) possible input-data to the GUI; and (4) the 
expected results of the test case (the oracle). 

There has been extensive recent work on developing automated
model-based GUI testing techniques. Current techniques (e.g.,
[1], [27], [10], [5], [28], [29], [12], [3]) use a black-box
approach to generate test cases. Further, they use a graph-based
model to represent the possible sequences of events with the GUI.
Each node in these graph-based models represents an event, which
is an interaction with one widget (e.g., selecting an element in
a listbox). A path in this graph-based model corresponds to a
sequence of events with the GUI; this sequence is used in the
GUI test case.

In this paper, we propose and evaluate a grey-box [14] approach
for automated GUI testing. The underlying mechanism for the
grey-box approach is a new event-dependency graph (EDG) model
that captures data dependencies between event handlers in the
GUI code. More specifically, an EDG is a weighted directed graph
in which each node represents an event in the GUI. An edge from
the node representing event $e_1$ to the node representing event
$e_2$ shows that there is a data dependency from $e_1$'s event
handler to $e_2$'s event handler. The weight of the edge is the
number of fields that flow from $e_1$'s event handler to $e_2$'s
event handler. Abstract event sequences are generated by using a
minimax search [2] on the EDG. An abstract event sequence is a
path through the EDG. Because of the nature of the EDG model,
these abstract event sequences chain together only those events
that have data dependencies between their event handlers.
However, consecutive events of an abstract event sequence are
not necessarily allowed one after the other by the GUI's
structure. For example, $e_1$ may be an event in the MainWindow,
whereas event $e_2$ may be in the FileOpen dialog. An
intermediate event that opens the FileOpen dialog is needed
before $e_2$. Hence, abstract event sequences, which are paths
in the EDG, may not be executable, which is why we call them
"abstract" event sequences. To convert abstract event sequences
into "executable" event sequences, a mapping is maintained
between the EDG and the GUI's workflow, represented using an
existing event-flow graph (EFG) black-box model of the GUI [20].
After applying the mapping, we obtain event sequences that (1)
are allowed by the GUI's structure, and (2) chain together only
those events that have data dependencies between their event
handlers. By embedding these executable event sequences into GUI
test cases, a compact test suite is formed, which efficiently
samples the GUI's event space.

We evaluate the grey-box approach on four open-source
applications: TerpWord, Rachota, FreeMind, and JabRef. The
results show a dramatic increase in the efficiency of the event
sequence generation and execution. Further, two new bugs are
revealed: one in Rachota and one in JabRef.

The paper is organized as follows: Section 2 provides the
background of model-based GUI testing using a black-box
approach. Section 3 introduces our grey-box GUI testing
approach, which incorporates an event-flow graph (EFG) and an
event-dependency graph (EDG) to generate efficient event
sequences. Section 4 provides an overview of the implementation,
which we use to evaluate the approach (Sections 5 through 6).
Section 8 summarizes the related work, and finally, Section 9
presents the conclusions and future work.

2. BACKGROUND 

When testing a system through its GUI, only a finite set 
of user interactions can be tested. The choice of this set 
is vital to the success of the testing procedure. A common 
way to sample the possibly infinite set of sequences is to 
use a graph-based model of the GUI, called event-flow graph 
(EFG). 

An event-flow graph, $EFG = (E, I, \delta)$, for an application
is a directed graph. Each node $e \in E$ is an event in the GUI.
An event is a response of the system to a user interaction
(e.g., a click on a button triggers an onClick event). Each
event in $I \subseteq E$ is an initial event, which can be
executed directly after the application is launched. An edge
$(e, e') \in \delta$ between two events $e, e' \in E$ states
that the event $e'$ can be executed immediately after the event
$e$. Conversely, if there is no edge from $e$ to $e'$, then
event $e'$ cannot be executed immediately after event $e$. This
may be owing to structural characteristics of the GUI. For
example, executing $e$ may close the window containing $e'$. The
EFG can be obtained automatically from the application using a
GUI Ripper [21]. Section 4.2 outlines the construction of the
EFG, its benefits and limitations.

Figure 1(a) shows the GUI of an example application. The
MainWindow appears when the application is launched. A
modal dialog Dialog appears when the button $e_3$ is clicked.
It is closed when the button $e_4$ is clicked.




Figure 1: An Example Application. (a) GUI; (b) EFG.

Figure 1(b) shows the corresponding EFG of the example
application, which consists of 4 events ($e_1$ to $e_4$), where
the events $e_1$, $e_2$, $e_3$ are initial events. Executing
event $e_3$ opens the modal dialog, such that $e_4$ becomes
accessible. The event $e_4$ closes the Dialog and thus, after
$e_4$ is executed, it becomes inaccessible again.

An event sequence in an EFG is a sequence of events which
represents a sequence of user interactions with the GUI. An
executable event sequence $s = e_0, \ldots, e_n$ is a sequence
of events which starts with an initial event $e_0 \in I$.

Definition 1. Given an event-flow graph $EFG = (E, I, \delta)$.
An executable event sequence is a sequence of events
$s = e_0, \ldots, e_n$, such that $e_0 \in I$ and
$(e_i, e_{i+1}) \in \delta$ for all $0 \le i < n$.
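Definition 1 translates directly into a membership check. The following is a minimal sketch, not part of the described tooling: the class name, the string encoding of events, and the edge set (inferred from the sequences listed in Table 1) are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Checks Definition 1 on a hand-coded EFG of the example application. */
final class ExecutableSequenceCheck {
    // Relation delta: successors of each event (assumed, inferred from Table 1).
    static final Map<String, Set<String>> DELTA = Map.of(
            "e1", Set.of("e1", "e2", "e3"),
            "e2", Set.of("e1", "e2", "e3"),
            "e3", Set.of("e4"),
            "e4", Set.of("e1", "e2", "e3"));

    // Initial events I: available directly after launch.
    static final Set<String> INITIAL = Set.of("e1", "e2", "e3");

    /** True iff s starts in an initial event and every consecutive pair is an EFG edge. */
    static boolean isExecutable(List<String> s) {
        if (s.isEmpty() || !INITIAL.contains(s.get(0))) {
            return false;
        }
        for (int i = 0; i + 1 < s.size(); i++) {
            Set<String> successors = DELTA.getOrDefault(s.get(i), Set.of());
            if (!successors.contains(s.get(i + 1))) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions, (e3, e4, e2) is accepted, whereas (e4, e2) is rejected because e4 is not an initial event.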

From the EFG, sequences of events of a particular length are
sampled. For instance, using a sequence length of 1 leads to the
following event sequences: $s_1 = (e_1)$, $s_2 = (e_2)$,
$s_3 = (e_3)$, and $s_4 = (e_3, e_4)$. Note that sequence $s_4$
has length 2. This is because $e_4$ cannot be tested with a
sequence of length 1; therefore, additional reaching steps are
introduced to connect $e_4$ to an initial event of the EFG.

Although event sequences of length 1 provide a compact 
set, it is certainly not sufficient for bug detection, since pairs 
and triples of events are not considered. However, increas- 
ing the length of the event sequences does not scale, as the 
number of generated sequences grows exponentially (Table 1
shows all event sequences generated with a length of 2 for
the EFG in Figure 1(b)). That is, a better technique for
sampling the EFG is needed in order to generate event 
sequences with a reasonable length. In the following we 
present a technique to efficiently generate a compact set of 
relevant event sequences of arbitrary length. 
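The length-2 sampling with reaching steps can be sketched as follows. This is an illustrative reading of the procedure, not the original generator: one sequence is emitted per EFG edge, and the first event of the edge is prefixed with a shortest path from an initial event. The class and method names, and the edge set of the example EFG, are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EfgPairSequences {
    /** Example EFG (assumed edge set) with insertion-ordered successors. */
    static Map<String, List<String>> exampleEfg() {
        Map<String, List<String>> delta = new LinkedHashMap<>();
        delta.put("e1", List.of("e1", "e2", "e3"));
        delta.put("e2", List.of("e1", "e2", "e3"));
        delta.put("e3", List.of("e4"));
        delta.put("e4", List.of("e1", "e2", "e3"));
        return delta;
    }

    /** Shortest event sequence from some initial event to target (breadth-first search). */
    static List<String> reach(Map<String, List<String>> delta, List<String> initial, String target) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String e : initial) {
            parent.put(e, null);
            queue.add(e);
        }
        while (!queue.isEmpty()) {
            String e = queue.poll();
            if (e.equals(target)) {
                List<String> path = new ArrayList<>();
                for (String x = e; x != null; x = parent.get(x)) {
                    path.add(0, x);
                }
                return path;
            }
            for (String next : delta.getOrDefault(e, List.of())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, e);
                    queue.add(next);
                }
            }
        }
        throw new IllegalArgumentException("unreachable event: " + target);
    }

    /** One sequence per EFG edge (e, e'), prefixed with the reaching steps for e. */
    static List<List<String>> pairs(Map<String, List<String>> delta, List<String> initial) {
        List<List<String>> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : delta.entrySet()) {
            List<String> prefix = reach(delta, initial, entry.getKey());
            for (String next : entry.getValue()) {
                List<String> s = new ArrayList<>(prefix);
                s.add(next);
                result.add(s);
            }
        }
        return result;
    }
}
```

For the assumed example EFG with initial events e1, e2, e3, `pairs` yields ten sequences, the last three of which carry the reaching step e3 in front of e4. The exponential growth noted above shows up as soon as the same construction is repeated for triples.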

3. GREY-BOX GUI TESTING 

An EFG is useful to generate feasible event sequences.
However, when generating longer event sequences, the number of
sequences becomes prohibitively large and a more sophisticated
sampling strategy is needed.



$s_1 = (e_1, e_1)$    $s_5 = (e_2, e_2)$         $s_9 = (e_3, e_4, e_2)$
$s_2 = (e_1, e_2)$    $s_6 = (e_2, e_3)$         $s_{10} = (e_3, e_4, e_3)$
$s_3 = (e_1, e_3)$    $s_7 = (e_3, e_4)$
$s_4 = (e_2, e_1)$    $s_8 = (e_3, e_4, e_1)$

Table 1: Event Sequences Generated from an EFG with a
Sequence Length of 2

To efficiently sample the event sequences generated from 
an EFG, we propose to incorporate additional information 
from the source code of the event handlers. Knowing which 
fields are modified and which are read upon the execution of 
an event makes it possible to prioritize sequences of events 
where the event handlers influence each other and to avoid 
those sequences, where events are completely independent 
(e.g., a Copy and a Help button in a word processor). 



    class MainWindow {
        boolean enabled = true;
        String text = "Hello World";

        void e1() {
            enabled = false;
        }

        void e2() {
            text = text.toLowerCase();
        }

        void e3() {
            if (enabled)
                openDialog(this);
            else
                Log.write(text);
        }
    }

    class Dialog {
        MainWindow mainWindow;

        void e4() {
            mainWindow.text = null;
            closeDialog();
        }
    }

Listing 1: Java Snippet of the Example Event Handlers.

Listing 1 shows the Java snippet of the example application, in
particular its four event handlers. The example application
consists of the classes MainWindow and Dialog, where MainWindow
contains three event handlers (e1, e2, and e3), and Dialog
contains the event handler e4. Event handler e1 sets the field
enabled to false, and e2 converts the string in field text to
lower case. In e3, the field enabled is evaluated in a
conditional. If enabled is true, the dialog is opened and the
current instance this of MainWindow is passed to the dialog. If
enabled is false, the content of text is written to a log. Event
handler e4 sets the field text of the MainWindow instance held
by the dialog to null and closes the dialog.

The execution of the event sequence $(e_3, e_4, e_2)$ throws a
NullPointerException, because e2 dereferences the field text,
which was set to null in e4. This example application is a
simplified version of a bug which we found in real-world
applications.
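The failing sequence can be replayed without a GUI. The sketch below is a self-contained variant of Listing 1 under stated assumptions: openDialog, closeDialog, and Log.write are replaced by stand-ins, and the dialog field on MainWindow as well as the helper e2Fails are hypothetical additions for the demonstration.

```java
/** Self-contained variant of Listing 1; dialog handling and logging are stubbed. */
final class NullTextDemo {
    static class MainWindow {
        boolean enabled = true;
        String text = "Hello World";
        Dialog dialog; // stand-in: non-null while the modal dialog is open

        void e1() { enabled = false; }

        void e2() { text = text.toLowerCase(); } // dereferences text

        void e3() {
            if (enabled) {
                dialog = new Dialog(this);     // stands in for openDialog(this)
            } else {
                System.out.println(text);      // stands in for Log.write(text)
            }
        }
    }

    static class Dialog {
        final MainWindow mainWindow;

        Dialog(MainWindow mainWindow) { this.mainWindow = mainWindow; }

        void e4() {
            mainWindow.text = null;    // the write that e2 later reads
            mainWindow.dialog = null;  // stands in for closeDialog()
        }
    }

    /** Replays (e3, e4, e2) if viaDialog is set, else just (e2); true iff e2 threw. */
    static boolean e2Fails(boolean viaDialog) {
        MainWindow w = new MainWindow();
        if (viaDialog) {
            w.e3();          // opens the dialog, since enabled is still true
            w.dialog.e4();   // nulls text and closes the dialog
        }
        try {
            w.e2();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

Executing e2 on its own succeeds, so the fault is only exposed when e4 precedes e2, which is exactly the write-then-read pairing that a data-dependency model is meant to surface.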

Without considering the application's source code, in the worst
case, all sequences of length 2 must be generated and executed
to detect the bug. For our example, this leads to 10 event
sequences in total. When analyzing the source code of an
application, we observe that certain event handlers share a data
dependency, which helps to prioritize or to omit certain events
during event sequence generation: Event e1 writes field enabled,
which is read in e3; e4 writes field text, which is read in e2.
Further, there is no data dependency between $e_1$ and $e_2$. To
utilize these data dependencies for a more efficient event
sequence generation, we introduce a new graph-based model called
the event-dependency graph (EDG).

3.1 Event-dependency Graph 

An event-dependency graph $EDG = (E, \psi)$ is a directed graph
where, like in the EFG, each node in $E$ represents a GUI event.
Note that in contrast to the EFG, an EDG does not have initial
events since it represents data dependency and not control flow.
An edge $(e, w, e') \in \psi$ is labeled with a weight $w$. The
weight $w \in \mathbb{N}^+$ indicates the data dependency
between $e$ and $e'$.

The edge weight $w$ is computed as follows: All fields which are
written in the event handler of $e$ are collected in a set $W$.
All fields that are read in the event handler of $e'$ are
collected in a set $R$. For each event handler, we recursively
follow potential method calls, collect the fields accessed
there, and place them in the sets $W$ and $R$, respectively. The
edge from $e$ to $e'$ is labeled with the size of the
intersection of these sets, $|R \cap W|$.

A path $\pi = e_i \ldots e_j$ in the EDG represents a sequence
of events, where the execution of one event always changes
fields which are read by the succeeding event. However, it is
not necessary that two consecutive events on the path can be
executed consecutively in the GUI. The benefit of these
sequences is that the execution of one event might change fields
that are relevant for the execution of its successor and cause
the successor to execute other code fragments. This can lead to
a higher code coverage and further reduce the amount of code
that is tested redundantly. Since the EDG has no initial events,
and succeeding events on a path in the EDG might not be directly
executable in the GUI, we refer to an EDG path as an abstract
event sequence.

Definition 2. Given an event-dependency graph $EDG = (E, \psi)$.
An abstract event sequence is a sequence of events
$\pi = e_i, \ldots, e_j$, such that $(e_k, w, e_{k+1}) \in \psi$
for some weight $w$, for all $i \le k < j$.



Algorithm 1: Construction of the EDG.

Input:  $P$ : Program,
        $(E, I, \delta)$ : Event-flow graph
Output: $(E', \psi)$ : Event-dependency graph
 1 begin
 2   $E' = E$
 3   $W = \{\}$, $R = \{\}$
 4   foreach ($e$ in $E$) do
 5     $W$ = getFieldsWritten($e$, $P$)
 6     foreach ($e'$ in $E$) do
 7       $R$ = getFieldsRead($e'$, $P$)
 8       if ($(R \cap W) \neq \emptyset$) then
 9         $w = |R \cap W|$
10         $\psi = \psi \cup \{(e, w, e')\}$
11       end if
12     end foreach
13   end foreach
14 end



Algorithm 1 shows how the EDG is constructed. The algorithm
takes the program $P$ and a corresponding event-flow graph EFG
as input, and returns an event-dependency graph EDG. Since both
EFG and EDG refer to the same set of events, we copy $E$ to $E'$
(line 2). Then, we iterate over all pairs of events $e, e'$
(line 4).

We call the method getFieldsWritten, which returns a set $W$ of
all fields that are written during the execution of the event
handler of $e$ (line 5). Then, we call the method getFieldsRead,
which returns a set $R$ of all fields which are read during the
execution of the event handler of $e'$ (line 7). If the
intersection of $R$ and $W$ is not empty (line 8), we add a new
edge to the EDG, which is labeled with the size of the
intersection (line 10). Note that our algorithm does not create
an edge between events if the intersection of $R$ and $W$ is
empty. In this case, there is no data dependency between both
events and thus, they are not directly connected (otherwise the
EDG would be fully connected).
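The nested loop of Algorithm 1 can be sketched in a few lines. This is an illustrative version under assumptions: the per-handler field sets are passed in as precomputed maps (standing in for getFieldsWritten and getFieldsRead), fields are encoded as strings, and the example sets are written by hand to mirror the dependencies named for the running example; in particular, the logging of text in e3 is not counted as a read here.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class EdgBuilder {
    /**
     * Returns weights.get(e).get(e2) = size of (written(e) intersected with read(e2)).
     * Pairs with an empty intersection get no entry, i.e., no edge.
     */
    static Map<String, Map<String, Integer>> build(
            Map<String, Set<String>> written, Map<String, Set<String>> read) {
        Map<String, Map<String, Integer>> weights = new LinkedHashMap<>();
        for (String e : written.keySet()) {
            Map<String, Integer> out = new LinkedHashMap<>();
            for (String e2 : read.keySet()) {
                Set<String> shared = new HashSet<>(written.get(e));
                shared.retainAll(read.get(e2));
                if (!shared.isEmpty()) {
                    out.put(e2, shared.size());
                }
            }
            weights.put(e, out);
        }
        return weights;
    }

    /** Written fields per handler of the running example (hand-written assumption). */
    static Map<String, Set<String>> exampleWritten() {
        Map<String, Set<String>> w = new LinkedHashMap<>();
        w.put("e1", Set.of("MainWindow.enabled"));
        w.put("e2", Set.of("MainWindow.text"));
        w.put("e3", Set.of());
        w.put("e4", Set.of("MainWindow.text"));
        return w;
    }

    /** Read fields per handler (assumption: e3's logging of text is not counted). */
    static Map<String, Set<String>> exampleRead() {
        Map<String, Set<String>> r = new LinkedHashMap<>();
        r.put("e1", Set.of());
        r.put("e2", Set.of("MainWindow.text"));
        r.put("e3", Set.of("MainWindow.enabled"));
        r.put("e4", Set.of("Dialog.mainWindow"));
        return r;
    }
}
```

With these inputs, `build` produces the edges e1 to e3, e2 to e2, and e4 to e2, each with weight 1, and no outgoing edge for e3.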

3.2 Event Sequence Generation 

Our event sequence generation consists of two consecutive steps.
First, we select potentially interesting sequences of events,
called abstract event sequences, from the EDG using Algorithm 2.
Second, we use the abstract event sequences to generate
executable event sequences from the EFG using Algorithm 3.

Algorithm 2 takes an EDG and two parameters as input: len gives
the maximum length of the abstract event sequences to be
generated, and top gives the maximum number of abstract event
sequences to be generated for each event. The algorithm returns
a set $\Pi$ of abstract event sequences. These are later used
for generating executable event sequences.



Algorithm 2: Generating abstract event sequences.

Input:  $(E, \psi)$ : Event-dependency graph,
        len : max length of abstract event sequence,
        top : max number of abstract event sequences per event
Output: $\Pi$ : set of abstract event sequences
 1 begin
 2   Sequences of events $\Pi = \{\}$
 3   foreach Event $e \in E$ do
 4     Sequences of events $\Pi' = \{\}$
 5     while $|\Pi'| <$ top do
 6       Sequence of events $\pi = e$
 7       Event $e' = e$
 8       while $|\pi| <$ len $\wedge$ post($e'$) $\neq \{\}$ do
 9         $e'$ = bestSucc($e'$, $\Pi$)
10         $\pi = \pi \bullet e'$
11       end while
12       if $\pi \in \Pi$ then break
13       $\Pi' = \Pi' \cup \{\pi\}$
14     end while
15     $\Pi = \Pi \cup \Pi'$
16   end foreach
17   return $\Pi$
18 end



For each event $e \in E$, a new set $\Pi'$ of abstract event
sequences is created, which initially is empty (line 4). As long
as the size of this set is smaller than top (line 5), we add
further abstract event sequences (line 6). Each such abstract
event sequence $\pi$ initially contains only $e$ (as we are
looking for sequences of events that start in $e$). While the
length of $\pi$ is smaller than len (line 8), and the last event
of $\pi$ still has successors, the method bestSucc finds the
best possible successor, which is added to the end of $\pi$
(line 10). The method bestSucc uses a minimax strategy to
identify the best successor, unless this successor leads us on a
path which is already in $\Pi'$. In this case, bestSucc returns
the second best choice. We use the minimax strategy to minimize
the selection of events with low dependencies.

The loop in line 5 terminates either if it has collected top
abstract event sequences that start with the event $e$, or if
the algorithm detects a path twice (line 12). In the latter
case, bestSucc cannot find a suitable path that has not been
visited so far.
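The selection step can be approximated as follows. Note that this sketch deliberately swaps the technique: instead of the minimax-based bestSucc, it enumerates paths depth-first with successors ordered by descending edge weight, and keeps the first top maximal paths per start event. It is meant only to illustrate the roles of len and top; all names are hypothetical, and Integer.MAX_VALUE stands in for an unbounded top.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class AbstractSequences {
    /** At most top sequences per start event, each at most len events long. */
    static List<List<String>> generate(Map<String, Map<String, Integer>> edg,
                                       List<String> events, int len, int top) {
        List<List<String>> all = new ArrayList<>();
        for (String e : events) {
            List<List<String>> perEvent = new ArrayList<>();
            List<String> path = new ArrayList<>(List.of(e));
            extend(edg, path, len, top, perEvent);
            all.addAll(perEvent);
        }
        return all;
    }

    private static void extend(Map<String, Map<String, Integer>> edg, List<String> path,
                               int len, int top, List<List<String>> out) {
        if (out.size() >= top) {
            return; // enough sequences for this start event
        }
        Map<String, Integer> succ = edg.getOrDefault(path.get(path.size() - 1), Map.of());
        if (path.size() >= len || succ.isEmpty()) {
            out.add(new ArrayList<>(path)); // maximal path: record a copy
            return;
        }
        List<Map.Entry<String, Integer>> ordered = new ArrayList<>(succ.entrySet());
        // Strongest dependency first; ties broken by event name for determinism.
        ordered.sort((a, b) -> {
            int byWeight = Integer.compare(b.getValue(), a.getValue());
            return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
        });
        for (Map.Entry<String, Integer> next : ordered) {
            path.add(next.getKey());
            extend(edg, path, len, top, out);
            path.remove(path.size() - 1); // backtrack
        }
    }
}
```

Because every recursion step either records a path or lengthens it, the bound len guarantees termination even when the EDG contains self-loops such as an event that reads and writes the same field.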

For each abstract event sequence in $\Pi$, we want to generate
an executable event sequence. However, the abstract event
sequences are not necessarily executable, as consecutive events
in the EDG might have no direct connection in the EFG.
Therefore, we use Algorithm 3 to find one EFG path for each of
these abstract event sequences, which starts in an initial event
of the EFG. Note that the only case where such a path does not
exist is if the application is terminated between the execution
of two events. In that case, we split the sequence into two
sequences that are later tested immediately after each other.



Algorithm 3: Conversion from abstract event sequences
to executable event sequences.

Input:  $(E, I, \delta)$ : Event-flow graph,
        $\Pi$ : set of abstract event sequences
Output: $S$ : Set of executable event sequences
 1 begin
 2   Sequences of events $S = \{\}$
 3   foreach Sequence $e_i, \ldots, e_j$ in $\Pi$ do
 4     pick $e_0$ from $I$
 5     Path $s$ = shortestPath($e_0$, $e_i$)
 6     for $k = i$ to $j - 1$ do
 7       $s = s \bullet$ shortestPath($e_k$, $e_{k+1}$)
 8     end for
 9     $S = S \cup \{s\}$
10   end foreach
11   return $S$
12 end



Algorithm 3 takes an EFG and the set $\Pi$ of abstract event
sequences computed by Algorithm 2 as input, and returns a set of
executable event sequences, which are paths in the EFG and start
in an initial event. For each sequence of events
$e_i \ldots e_j$ in $\Pi$ (line 3), a path $s$ (line 5) is
created. We pick the shortest path from an event $e_0 \in I$ to
$e_i$, and then iterate over the events in the abstract event
sequence and always append the shortest path between succeeding
events to $s$ (line 7). Then we add $s$ to the set $S$ (line 9).
Since the paths in $S$ start in an initial event of the EFG,
they can immediately be executed as GUI event sequences.
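The conversion can be sketched with a breadth-first search over the EFG. The version below is an illustration under assumptions, not the original implementation: events are strings, the EFG is an adjacency map, and the helper bfs returns a shortest path with at least one event, so that a pair such as (e2, e2) is mapped to the self-loop rather than to an empty path.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SequenceConverter {
    /** Shortest path that starts in one of the frontier events and ends in target. */
    static List<String> bfs(Map<String, List<String>> delta, List<String> frontier, String target) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String e : frontier) {
            if (!parent.containsKey(e)) {
                parent.put(e, null);
                queue.add(e);
            }
        }
        while (!queue.isEmpty()) {
            String e = queue.poll();
            if (e.equals(target)) {
                List<String> path = new ArrayList<>();
                for (String x = e; x != null; x = parent.get(x)) {
                    path.add(0, x);
                }
                return path;
            }
            for (String next : delta.getOrDefault(e, List.of())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, e);
                    queue.add(next);
                }
            }
        }
        throw new IllegalArgumentException("no EFG path to " + target);
    }

    /** Maps one abstract event sequence to an executable one. */
    static List<String> toExecutable(Map<String, List<String>> delta, List<String> initial,
                                     List<String> abstractSeq) {
        // Reaching steps: from some initial event to the first abstract event.
        List<String> s = bfs(delta, initial, abstractSeq.get(0));
        // Stitch consecutive abstract events together along shortest EFG paths.
        for (int k = 0; k + 1 < abstractSeq.size(); k++) {
            List<String> successors = delta.getOrDefault(abstractSeq.get(k), List.of());
            s.addAll(bfs(delta, successors, abstractSeq.get(k + 1)));
        }
        return s;
    }
}
```

Starting each inner search from the successors of the current event avoids duplicating that event when the partial paths are concatenated.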

Infeasible event sequences are only generated if the EFG is 
not complete (e.g., because it was generated automatically) 
or if the data dependency analysis is imprecise. As these are 
implementation issues, we refer to Section 4 for details.

Figure 2 shows the EDG of our example application. If we apply
Algorithm 2 with len = 2 and top = $\infty$, followed by
Algorithm 3, we obtain the following executable event sequences:
$e_1$ writes a field that is read by $e_3$, which results in
$s_1 = (e_1, e_3)$. Since the field text is both read and
written in $e_2$, $s_2 = (e_2, e_2)$ is generated. $e_3$ does
not write a field that is read by any other event and is thus
considered in a single-event sequence $s_3 = (e_3)$. Finally,
$e_4$ writes a field that is read by $e_2$, which leads to
$s_4 = (e_3, e_4, e_2)$. Because $e_4$ is not an initial event,
the intermediate event $e_3$ is inserted.

Figure 2: EDG of the Example Application. Edge labels:
{enabled}, {text}.

Note that it is not possible to combine an EFG and an EDG into
one graph-based model: On the one hand, it is possible to label
a directed edge $(e_1, e_2)$ in the EFG with the weight of the
data dependency (e.g., zero in case of a non-dependency). On the
other hand, a directed weighted edge $(e_3, e_4)$ has to be
added to the EFG if a data dependency is detected. However, the
added edge may represent an event sequence which is not allowed
in the GUI.

4. IMPLEMENTATION 

We integrate an implementation of the grey-box approach into
GUITAR^1, which is an open-source, model-based system for
automated GUI testing. Figure 3 presents an overview of the
GUITAR system. The grey-highlighted steps in the overview
emphasize our extensions made to the GUITAR system. Considering
the grey-box approach, testing an application using the GUITAR
system consists of the following steps:




Figure 3: Overview of the GUITAR System, including the Grey-box
Extensions.



4.1 GUI Ripper 

In the first step, the GUI Ripper executes the application under
test (AUT) and records the GUI structure. A GUI structure
consists of widgets (e.g., windows, buttons, and text fields)
and their corresponding properties (e.g., enabled/disabled,
height, and width). While executing the AUT, the GUI Ripper
enumerates all widgets of the main window using reflection, and
stores the obtained information in the GUI structure. For each
widget found (e.g., a button), the GUI Ripper triggers the
assigned event (e.g., a button click). For instance, if the
click on the button opens a new window, the GUI Ripper continues
to record the GUI structure of that newly opened window, and so
on. The process stops once all windows found have been explored.
Since each GUI represents a hierarchical structure, a
depth-first search is performed on the AUT's GUI. For the
grey-box approach, we enhanced the GUI Ripper such that, for
each widget, the event handlers assigned to this widget are
additionally stored in the GUI structure. This information is
needed during the analysis of the bytecode, which is performed
as a part of the EDG construction.

^1 http://guitar.sourceforge.net/

4.2 EFG Construction 

The GUI structure recorded by the Ripper serves as input to the
EFG Construction, which automatically constructs the EFG that is
used for the test case generation. While the GUI structure
contains information about widgets and their properties, the EFG
represents an abstract view which only contains the events and
their following events. The EFG construction iterates over all
windows in the GUI structure and creates a single EFG for each
window. Later, these single EFGs are connected to one EFG
representing the entire application. For each window in the GUI
structure, the EFG construction creates an event for the window
itself and for each widget it contains. Then, the EFG
construction connects events of the window based on their widget
properties. For instance, if an event $e_1$ represents a window,
and an event $e_2$ an enabled button in this window, then an
edge from $e_1$ to $e_2$ is created in the EFG. If, instead,
$e_2$ is associated with a disabled button, then no edge between
$e_1$ and $e_2$ is created, because the event cannot be
triggered when the window appears. To connect the single EFGs,
each event which opens or accesses another window is connected
with all initial events of that window. The details of the EFG
construction can be found in [21].

4.2.1 Short Assessment of the EFG Construction

Since the GUI Ripper performs a dynamic analysis of the GUI, it
cannot be guaranteed to find all widgets of the AUT [20]. For
instance, the AUT itself might be hostile or even faulty: if the
GUI opens a new window in the background, the GUI Ripper will
not be able to find it, and thus, it cannot be considered during
EFG construction. Further, whether a widget is enabled or
disabled during ripping may strongly depend on the environment
(e.g., user settings). These problems tend to be of a technical
nature and their severity might differ depending on the platform
used.

Instead of executing the AUT in GUI ripping, it is in general
possible to create the GUI structure and the EFG, respectively,
via static analysis. However, a static analysis technique must
be tailored to comprehend how a GUI is created. Since there
exist different code styles for creating GUIs, a static
technique may reach its limits, in particular if a GUI is
defined outside the source code of the application, e.g., in XML
files.

Note that the EFG of an AUT is not complete and represents an
approximation of the AUT's event flow. It cannot be guaranteed
that a path in the constructed EFG is actually executable on the
AUT's GUI. For instance, if a click on a button changes the
entire parent window (e.g., removing or adding widgets), then
the GUI Ripper and the EFG construction, respectively, do not
recognize these changes made to the GUI. A test engineer has to
manually improve the EFG according to the actual behavior of the
AUT.

4.3 EDG Construction 

In order to construct an EDG, we perform a shallow bytecode [16]
analysis of the AUT to obtain data dependencies between events.
In particular, the bytecode analysis records which fields are
read and written by each event handler, that is, it implements
the functions getFieldsRead and getFieldsWritten in Algorithm 1.
Hence, the Java bytecode and the constructed EFG of the AUT
serve as input to the EDG Construction. For our bytecode
analysis, we use the ASM framework^2. Other frameworks such as
Soot^3 could be used equally well.

4.3.1 Bytecode Analysis

Listing 2 shows the bytecode of the event handlers e1 and e4
from the example application in Listing 1. In bytecode, fields
are read by the instruction GETFIELD, and written by PUTFIELD.
Further, methods are called using the INVOKE^4 instructions. In
line 2, a constant value of 0 is first pushed to the stack, and
then assigned to field enabled in line 3. In lines 6 and 7,
respectively, field mainWindow and a constant value of null are
pushed to the stack. Field text of mainWindow is then assigned
the value null. Finally, in line 9, method closeDialog of Dialog
is called.



    1  void e1()V
    2    ICONST_0
    3    PUTFIELD MainWindow.enabled : Z
    4
    5  void e4()V
    6    GETFIELD Dialog.mainWindow : LMainWindow;
    7    ACONST_NULL
    8    PUTFIELD MainWindow.text : Ljava/lang/String;
    9    INVOKEVIRTUAL Dialog.closeDialog()V

Listing 2: Bytecode Snippet of the Example Event Handlers.

The EDG construction is preceded by one step: the creation of a
class database (ClassDB). The ClassDB models the dependencies
between fields, methods, and classes of the AUT. During EDG
construction, a request to the ClassDB determines the data
dependency of two given event handlers. Figure 4 shows the ER
model of the ClassDB.

In order to build a ClassDB, the bytecode analysis starts with
visiting all classes of the AUT, since classes contain both
methods and fields. In our implementation, it is possible to
provide a scope (a set of JAR archives) to restrict the set of
classes to be analyzed. For instance, the scope can be chosen
such that only application classes are analyzed and third-party
libraries are discarded. Each class is stored in table Class of
the ClassDB and is identified by its fully-qualified name, to
avoid collisions if a certain class name is used multiple times.

Then, the bytecode analysis visits all methods of each class.
Note that it is important to inspect all methods, and not only
those which are declared as event handlers. Moreover, it is
necessary to follow all method calls in each method, which can
be detected by visiting the INVOKE instructions of the bytecode.
For instance, method e4 in Listing 2 calls method closeDialog,
which may write further fields. Thus, there exists a recursive
relationship calls between methods. Each method is stored in
table Method in the ClassDB and is associated with its class.

For each method, the bytecode analysis fetches all fields that
are read and written. This can be detected by visiting the field
access instructions of the bytecode: GETFIELD and PUTFIELD for
instance fields, and GETSTATIC and PUTSTATIC for class fields.
Read and written fields are stored in table Field, where each
field is associated with its method.




Figure 4: Simplified ER Model of the ClassDB. Labels: Method,
calls.

^2 http://asm.ow2.org/
^3 http://www.sable.mcgill.ca/soot/
^4 INVOKEVIRTUAL, INVOKESTATIC, INVOKESPECIAL

Once all classes, methods, and fields are visited and mapped in
the ClassDB, Algorithm 1 uses this information to construct the
EDG. For instance, if the algorithm requests getFieldsRead and
getFieldsWritten for a certain event $e$, the ClassDB aggregates
all methods called within the event handler of $e$. For each
called method, and for the event handler itself, the read and
written fields are collected and returned to the EDG
construction. In this way, a possible data dependency between
events is captured. Further, due to this shallow analysis of the
bytecode, the computation time for building the ClassDB is low,
even for big applications.
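The aggregation over called methods can be illustrated with a small in-memory stand-in for the ClassDB. This sketch assumes that the per-method reads, writes, and calls have already been extracted from the bytecode; the class ClassDbSketch, its Method record, and the string encoding of methods and fields are hypothetical and do not reflect the actual database schema.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ClassDbSketch {
    /** Direct facts about one method, as a bytecode visitor might record them. */
    static final class Method {
        final Set<String> reads = new HashSet<>();
        final Set<String> writes = new HashSet<>();
        final Set<String> calls = new HashSet<>();
    }

    private final Map<String, Method> methods = new HashMap<>();

    /** Returns the entry for a method, creating it on first use. */
    Method method(String name) {
        return methods.computeIfAbsent(name, k -> new Method());
    }

    Set<String> fieldsWritten(String handler) { return collect(handler, true); }

    Set<String> fieldsRead(String handler) { return collect(handler, false); }

    /** Unions the fields of the handler and of every method reachable via calls. */
    private Set<String> collect(String handler, boolean written) {
        Set<String> fields = new HashSet<>();
        Set<String> visited = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(handler);
        while (!work.isEmpty()) {
            String name = work.pop();
            if (!visited.add(name)) {
                continue; // already handled; also guards against recursive calls
            }
            Method m = methods.get(name);
            if (m == null) {
                continue; // method lies outside the analysed scope
            }
            fields.addAll(written ? m.writes : m.reads);
            m.calls.forEach(work::push);
        }
        return fields;
    }
}
```

The visited set makes the traversal terminate on recursive call chains, and methods without an entry model calls into code outside the configured scope, such as third-party libraries.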

4.3.2 Short Assessment of the EDG Construction

Java distinguishes between instance fields and class fields, 
which are treated the same way in our bytecode analysis. 
That is, not only class fields are mapped to a certain class 
in the ClassDB, but also instance fields. Moreover, instance 
fields are not mapped to their objects. Further, the byte- 
code analysis does not distinguish between calls of instance 
methods and class methods and thus, is not reliable regard- 
ing polymorphism. 

The bytecode analysis does not consider potential aliasing
of fields or potentially infeasible control flow. Hence, the
resulting EDG is only an approximation of the actual data 
dependencies between fields. However, we are interested in 
prioritizing events, so a cheap bytecode analysis in terms of 
computation time is sufficient, while leaving room for further 
in-depth analyses. 

4.4 Event Sequence Generator 

The Event Sequence Generator takes as input an EFG and an EDG of
the application. In this step, Algorithms 2 and 3 are applied.
The output is a set of executable event sequences, where each
executable event sequence is embedded into one GUI test case.

4.5 Replayer 

The Replayer is responsible for executing GUI test cases. A test
case consists of a precondition, an executable event sequence,
input data, and an oracle. Figure 5 presents an overview of the
Replayer process. It consists of the following steps: (1) it
selects an executable event sequence; (2) it prepares a test
case, which ensures that the precondition of the test case
holds; (3) it executes the test case on the AUT, which performs
the executable event sequence; (4) it restarts the AUT, which
covers the events exit and launch of the AUT; (5) it evaluates
whether the test case has failed or passed.

5. EXPERIMENT 

We compare our grey-box approach with the black-box approach by
studying efficiency and effectiveness. Efficiency is measured as
the computation time for generating the abstract event sequences
(in minutes) and the time for test case execution (in hours).
Effectiveness is measured as the line and branch coverage (in
percent). We define two research questions. Q1: Is the grey-box
approach efficient in terms of mean time to execute the test
cases? Q2: Is the grey-box approach effective in terms of mean
code coverage?

Figure 5: Overview of the Replayer Process. Steps: (1) Select
Event Sequence, (2) Prepare Test Case, (3) Execute Test Case,
(4) Restart AUT, (5) Evaluate Results.

5.1 Setup of the Experiment 

We evaluate the grey-box approach using four Java-based
open source applications: TerpWord 4.0 is a word processor,
Rachota 2.3 is a time recording system, FreeMind 0.9.0
creates mind maps, and JabRef 2.7 manages bibliographic
references. It is important to observe that we use stable
versions of all applications, where bugs are rarely found. We
choose these applications to consider both small and large
applications (in terms of # of classes), and to cover different
code styles. Table [2] shows some relevant statistics of the
Applications Under Test (AUTs): the number of lines of code
(LOC), number of classes (Classes), number of GUI events
(Events), number of edges in the EFG (EFG edges), and
number of edges in the EDG (EDG edges).





             TerpWord   Rachota    FreeMind   JabRef
LOC          6,842      13,750     40,922     68,468
Classes      215        468        1,362      4,027
Events       159        154        959        776
EFG edges    4,229      1,493      105,986    100,211
EDG edges    4,100      2,172      25,248     10,034



Table 2: Experiment setup. 



Table [3] shows the six different configurations used to test the
four applications. For brevity, we use identifiers
(ID) to refer to these configurations.

The black-box approach presented in [20] is used in
Configurations A, B, and C. These configurations are the
baseline of our experiment. Configuration A generates one event
sequence (length = 1) for each event in the EFG.
Configuration B (length = 2) generates event sequences for each pair
of events (e_i, e_j) that have a direct connection in the EFG.
Configuration C (length = 3) generates event sequences for
each triple of events (e_i, e_j, e_k), where (e_i, e_j) and (e_j, e_k)
are direct neighbors in the EFG.
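
The three baseline configurations amount to enumerating EFG walks with 1, 2, or 3 events. The sketch below shows that enumeration under simplifying assumptions: the EFG is a plain adjacency dictionary, the function name `sequences` and the toy graph are hypothetical, and any further filtering the real generator may apply is ignored.

```python
def sequences(efg, length):
    """All event sequences of `length` events whose consecutive
    events are connected by an edge in the EFG."""
    seqs = [(event,) for event in efg]       # length 1: every event once
    for _ in range(length - 1):              # extend by one EFG edge per round
        seqs = [s + (nxt,) for s in seqs for nxt in efg[s[-1]]]
    return seqs


# Hypothetical toy EFG: event -> events that may directly follow it.
efg = {"File": ["Open", "Save"], "Open": ["File"], "Save": []}

config_a = sequences(efg, 1)   # one sequence per event
config_b = sequences(efg, 2)   # one sequence per EFG edge
config_c = sequences(efg, 3)   # one sequence per pair of adjacent edges
```

Because each round multiplies the number of sequences by the out-degree of the last event, the count grows quickly with the length, which is the effect the grey-box configurations try to avoid.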

The grey-box approach is used in Configurations D, E, and F.
Configuration D considers abstract event sequences of length
2 and does not limit the number of abstract event sequences
generated per event. Configurations E and F have abstract
event sequences of length 3. Here, the number of generated
abstract event sequences is limited to 50 and 100, respectively.
This is because each event in the applications TerpWord
and FreeMind has about 25 dependent events. That
is, the number of abstract event sequences of this size is
already large. In particular, we are interested in knowing whether
doubling the number of generated abstract event sequences
from 50 to 100 has a significant impact on the coverage.

Black-box Approach      Grey-box Approach
ID  Configuration       ID  Configuration
A   len = 1             D   len = 2
B   len = 2             E   len = 3; top = 50
C   len = 3             F   len = 3; top = 100

Table 3: Experiment configurations.

Each generated executable event sequence is embedded in 
a GUI test case, which is executed by the Replayer. The precondition
of each test case states that all user settings have
to be deleted before performing the event sequence on the 
AUT. As an oracle, a crash monitor is used, which records 
any exception found during the test case execution. 

The test cases are executed on 10 Linux machines, each with a
4 x 2.0 GHz CPU, 4 GB RAM, and a 500 GB HDD. The experiment
was executed twice with the same setup (e.g., the same
seed for input-data) to ensure that the obtained results are 
reproducible. The total number of test cases executed is 
6,089,403. 

5.2 Experimental Results 

Table [4] shows a summary of the experimental results. For
each configuration we report the number of event sequences
(# es), broken event sequences (# broken es), total
generation time (gen t (m)), generation time per event sequence
(gen t per es (s)), total execution time (exec t (h)),
execution time per test case (exec t per tc (s)), line coverage
(line cov.) and branch coverage (branch cov.). The event
sequence generation time is expressed in minutes and the test
case execution time in hours. The generation time per event
sequence and execution time per test case are expressed in
seconds.

Observation 1: Configuration A has the smallest num- 
ber of event sequences and the lowest coverage. The num- 
ber of event sequences corresponds to the number of events, 
which means that each event is only tested once. However, 
we believe that this approach is useful for smoke tests [23],
since the generation and execution time is also the lowest 
amongst all the configurations. 

Observation 2: As expected, Configuration D is signif- 
icantly more efficient than Configuration B on the applica- 
tions TerpWord, FreeMind, and JabRef. For these appli- 
cations, both configurations have the same line and branch 
coverage. However, Configuration D uses significantly fewer 
event sequences and consumes less time than Configuration 
B. Thus, the grey-box approach generates more efficient 
event sequences than the black-box approach for these three 
applications. 

Observation 3: For Rachota, Configuration D attains a 
higher coverage than Configuration B. However, more event 
sequences are generated in Configuration D, and the execu- 
tion consumes more time. This is likely owing to the fact 
that in Rachota, the EDG has significantly more edges than 
the EFG. 

Observation 4: The number of generated event sequences
of JabRef in Configuration C exceeds 5 million. Compared
to Configuration B, the obtained coverage for JabRef is
disappointing with respect to the generation and execution
time. On the other hand, Configurations E and F are
significantly more efficient on the applications Rachota, FreeMind,
and JabRef, since fewer event sequences are generated and
executed while preserving the same line and branch coverage.

                    TerpWord   Rachota    FreeMind   JabRef
Configuration A (Black-Box Approach)
# es                159        154        959        776
# broken es                               5          5
gen t (m)           0.3        0.28       1.10       1.08
gen t per es (s)    0.12       0.12       0.12       0.12
exec t (h)          0.50       0.58       4.58       4.22
exec t per tc (s)   12         15         30         28
line cov. (%)       41         60         50         51
branch cov. (%)     22         31         36         22
Configuration B (Black-Box Approach)
# es                3,307      1,310      11,396     43,017
# broken es                               57         258
gen t (m)           6.62       2.62       24.68      93.2
gen t per es (s)    0.12       0.12       0.13       0.13
exec t (h)          11.94      5.82       98.13      358.48
exec t per tc (s)   13         16         31         30
line cov. (%)       55         61         53         54
branch cov. (%)     36         34         37         26
Configuration C (Black-Box Approach)
# es                79,949     20,221     489,250    5,360,366
# broken es                               2,446      32,162
gen t (m)           159.90     40.44      1,223.13   15,187.70
gen t per es (s)    0.12       0.12       0.15       0.17
exec t (h)          310.92     95.49      4,348.89   44,669.72
exec t per tc (s)   14         17         32         30
line cov. (%)       55         62         53         55
branch cov. (%)     36         36         38         27
Configuration D (Grey-Box Approach)
# es                2,695      1,407      9,944      5,860
# broken es                               63         83
gen t (m)           7.63       4.22       43.08      20.52
gen t per es (s)    0.17       0.18       0.26       0.21
exec t (h)          9.73       6.25       88.39      48.83
exec t per tc (s)   13         16         32         30
line cov. (%)       55         62         53         54
branch cov. (%)     36         36         37         26
Configuration E (Grey-Box Approach)
# es                2,068      1,781      7,113      9,595
# broken es                               45         135
gen t (m)           6.55       5.93       35.57      36.78
gen t per es (s)    0.19       0.2        0.3        0.23
exec t (h)          9.19       8.91       65.20      90.62
exec t per tc (s)   16         18         33         34
line cov. (%)       47         62         53         55
branch cov. (%)     26         36         38         27
Configuration F (Grey-Box Approach)
# es                4,036      3,307      12,904     18,497
# broken es                               81         261
gen t (m)           13.45      11.57      68.82      77.07
gen t per es (s)    0.2        0.21       0.32       0.25
exec t (h)          17.94      17.45      118.29     179.83
exec t per tc (s)   16         19         33         35
line cov. (%)       55         62         53         55
branch cov. (%)     36         36         38         27

Table 4: Results of the Experiment.

Observation 5: For TerpWord, Configuration E attains
a lower line and branch coverage than Configurations D
and F. Thus, the parameter top influences the quality of
the selected event sequences. In TerpWord, there are a few
events which have more than 50 dependent events. So,
setting top = 50 might not be effective enough.
Configuration F achieves more coverage with top = 100.



Observation 6: For the grey-box Configurations D, E 
and F, increasing the length of the abstract event sequences 
does not significantly improve the coverage. 

Observation 7: The number of broken event sequences
is relatively low compared to the total number of event
sequences; it ranges between 0.5% and 1.4%. Broken event
sequences are sequences that were sampled from the EFG, but could
not be executed due to the limitations described in
Section [4.2].
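
The reported range can be checked directly against the counts in Table 4. The short Python calculation below does this for the FreeMind and JabRef columns, the two applications for which the table lists broken event sequences; the variable names are only illustrative.

```python
# (configuration, application): (broken sequences, total sequences), from Table 4
broken = {
    ("A", "FreeMind"): (5, 959),       ("A", "JabRef"): (5, 776),
    ("B", "FreeMind"): (57, 11396),    ("B", "JabRef"): (258, 43017),
    ("C", "FreeMind"): (2446, 489250), ("C", "JabRef"): (32162, 5360366),
    ("D", "FreeMind"): (63, 9944),     ("D", "JabRef"): (83, 5860),
    ("E", "FreeMind"): (45, 7113),     ("E", "JabRef"): (135, 9595),
    ("F", "FreeMind"): (81, 12904),    ("F", "JabRef"): (261, 18497),
}

# Share of broken sequences per cell, in percent.
rates = {key: 100.0 * b / n for key, (b, n) in broken.items()}
lowest, highest = min(rates.values()), max(rates.values())
print(f"{lowest:.1f}% to {highest:.1f}%")  # prints "0.5% to 1.4%"
```

The lowest share occurs for FreeMind in Configuration C and the highest for JabRef in Configuration D.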

Observation 8: The experiment found 3 different bugs.
The first bug was found in JabRef with Configuration B and
Configuration D. A NullPointerException is thrown if the
user clicks Options, Manage custom imports, Add from
folder, Cancel. The bug is found by both the black-box and the
grey-box approach using a sequence length of 2.

Observation 9: The second bug was found in JabRef
with Configuration D. The following sequence of events causes
an ArrayOutOfBoundsException: (1) In the main window,
click Manage content selectors, which opens a new dialog;
(2) switch to the main window and choose Close database;
(3) switch back to the previously opened dialog and
click OK. The error occurs because the newly opened dialog is
modeless, which allows the user to close the database
although the dialog still invites the user to modify the
database.



Figure 6: EFG and EDG snippet of JabRef. [Panels: (a) EFG, (b) EDG; recoverable node labels: Close database, Manage content selectors.]

Figure [6] shows the EFG and EDG of JabRef that cor- 
responds to the found bug. In the event sequence genera- 
tion for event Close database, the grey-box approach de- 
tects the data dependency to event OK. This data depen- 
dency (weight = 2) consists of a field for JabRef's meta- 
data, which is written in OK and read in Close database. 
Thus, the abstract event sequence (Close database, OK) is 
generated. This abstract event sequence is converted into 
an executable event sequence, because there exists no corre- 
sponding path in the EFG. Algorithm O picks the shortest 
path from an initial event to Close database, and the short- 
est path between succeeding events to OK, which leads to 
the following executable event sequence: (Manage content
selectors, Close database, Manage content selectors,
OK). The black-box approach would be able to detect this
failure using an event sequence of length 4. However, it would
first need to generate and execute all possible sequences of
length 4.
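
The conversion described above (shortest path from an initial event to the first abstract event, then shortest paths between succeeding events) can be sketched with a breadth-first search. This is a minimal illustration and not the paper's algorithm listing: the function names and the toy EFG are hypothetical, and the toy graph was chosen so that the result matches the sequence reported for JabRef.

```python
from collections import deque


def shortest_path(efg, sources, target):
    """Breadth-first search: shortest event list that starts at one of
    `sources` and ends at `target`, or None if there is none."""
    queue = deque([s] for s in sources)
    seen = set(sources)
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in efg.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def to_executable(efg, initial_events, abstract_seq):
    """Expand an abstract event sequence into one the EFG allows."""
    first, *rest = abstract_seq
    path = shortest_path(efg, initial_events, first)
    if path is None:
        return None
    for event in rest:
        # Continue from the events that may follow the last executed event.
        hop = shortest_path(efg, efg.get(path[-1], ()), event)
        if hop is None:
            return None
        path = path + hop
    return path


# Hypothetical EFG fragment: event -> events that may directly follow it.
efg = {
    "Manage content selectors": ["Close database", "OK"],
    "Close database": ["Manage content selectors"],
    "OK": [],
}
initial = ["Manage content selectors"]
executable = to_executable(efg, initial, ["Close database", "OK"])
# executable == ["Manage content selectors", "Close database",
#                "Manage content selectors", "OK"]
```

An abstract sequence whose events cannot be connected in the EFG yields `None` in this sketch.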

Observation 10: The third bug was found in Rachota
with Configuration D. The following sequence of events causes
a NullPointerException at restart: (1) Click on System
settings; (2) add a new task (Add task) and leave the
text fields blank; (3) click the OK button (OK2); (4)
click on the OK button (OK2) that writes all tasks to a file.
The error occurs because the newly added task contains a
null value when it is written to the user settings. Then, a
null reference is returned when the user settings are read,
which is not correctly handled.

Figure 7: EFG and EDG snippet of Rachota. [Figure residue: node labels System settings and Add task; panel (b) EDG.]

Figure [7] shows the EFG and EDG of Rachota that
correspond to the bug. When generating abstract event
sequences, we choose those sequences with the highest edge
weights first. For event OK2, the first abstract event sequence
is (OK2, System settings), with a weight of 6. Our
second abstract event sequence is (OK2, OK1), with a weight of
6. Since this abstract event sequence is not allowed to
execute in its current form, it is converted into an executable
event sequence using Algorithm [S]. Hence, the final
executable event sequence that can be run on the application is
(System settings, Add task, OK2, System settings, OK1).
The black-box approach would be able to detect this failure
using an event sequence of length 5. However, the number
of event sequences with length 5 would be 6,605,912, since
Rachota consists of 154 events.
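
Choosing the sequences with the highest edge weights first can be sketched as a sort over the outgoing EDG edges of an event. The sketch assumes the EDG is a nested dictionary of weights and that `top` caps the number of sequences per event; the function name `abstract_sequences` and the weight of the Add task edge are hypothetical, while the two weights of 6 are taken from the Rachota example.

```python
def abstract_sequences(edg, event, top=None):
    """Length-2 abstract event sequences starting at `event`,
    ordered by descending EDG edge weight and cut off after `top`."""
    ranked = sorted(edg.get(event, {}).items(), key=lambda kv: -kv[1])
    if top is not None:
        ranked = ranked[:top]
    return [(event, dependent) for dependent, _ in ranked]


# EDG fragment: event -> {dependent event: weight}.
edg = {"OK2": {"System settings": 6, "OK1": 6, "Add task": 1}}

first_two = abstract_sequences(edg, "OK2", top=2)
# first_two == [("OK2", "System settings"), ("OK2", "OK1")]
```

Python's sort is stable, so edges with equal weight keep their insertion order; a real generator would need its own tie-breaking rule.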

6. DISCUSSION 

Regarding the research questions of our experiment, we
found that the answer to Q1 is yes: considering possible
data dependencies between events leads to fewer event
sequences and decreases the time to run all test cases. The
answer to Q2 is no: we did not find enough evidence to
show an improvement in effectiveness.

The main result is that the grey-box approach in most
cases produces far fewer test cases (each generated abstract
event sequence represents one test case) than the black-box
approach. Our initial assumption is that in GUIs several
widgets carry out completely independent tasks. For
example, a toolbox usually offers save, print, copy, undo/redo
and find. However, print is unlikely to have a side effect on
all other widgets. Thus, it is not efficient to test all
combinations of print plus one other event (see Configuration
B). Here, the grey-box approach can achieve significant
improvements. For Rachota, more abstract event sequences are
generated than in the black-box approach. The reason is
that the events in Rachota have many dependencies on
other events. In this case, the EDG has more edges than
the EFG, and thus more abstract event sequences
are generated. However, we observe that in the other
AUTs the number of EFG edges is higher than the number
of EDG edges.

Increasing the length of event sequences in both approaches
(black-box and grey-box) implies a considerable increase in
the generation time. However, compared with the
execution time, the generation time itself is not a big issue.
Moreover, in practice, the testing process can be very
limited in terms of the resources and time available to generate
all event sequences and execute all test cases. We could
therefore adapt our approach to on-the-fly test case
generation, where a specific timeout is given and parameters like
len and top are not fixed, but vary within a range.

Using the grey-box approach, two different bugs were found
that were not found by the black-box approach. There
are two main reasons: (1) all abstract event sequences
incorporate data dependencies in the application's bytecode, and
(2) the abstract event sequences have a non-fixed length.
For instance, while Configurations A, B, and C select events
that are directly connected, Configurations D, E, and F
select events based on their data dependencies. Thus, the
executable event sequence length in these grey-box
configurations may vary compared to the black-box
configurations. Further, in the grey-box approach an
executable event sequence can be very long, e.g., if the
distance between the events of an abstract event sequence, in terms of
intermediate events, is very high in the EFG.

The overall code coverage reported in our experiment is 
relatively low for several reasons. For instance, key strokes 
(KeyListener) and mouse gestures (MouseListener) are not 
yet considered, but frequently used in the application Free- 
Mind, in order to draw a mind-map via mouse interactions. 
Support for these events in the GUI ripper and Replayer is 
scheduled for a future release of GUITAR. Moreover, the
use of random input data may lead to the execution of default
branches in the applications.

7. THREATS TO VALIDITY 

We report two threats to internal validity. The first is the
experiment replication. Almost all applications store user
settings on the HDD, such as enabled and disabled toolbars,
recently opened files, etc. In order to ensure the precondition
(i.e., the system's state) for each run of a test case, it is
important that those user settings are deleted before
execution. Otherwise, test cases may mistakenly fail, e.g., because a
GUI component is not found due to an existing user setting.
In order to decrease this threat to internal validity, we ran
the experiment twice and obtained the same results.

The second is that some applications are strongly con- 
nected to the date and time of their execution. For instance, 
GUI components like calendar controls are considered in the 
GUI ripper and in the construction of the EFG. When
replaying the test cases, some of them may fail because the
GUI components are not recognized anymore (during
replaying, the calendar control shows a different date than when
it was ripped).

One threat to external validity is the portability of the 
configurations. For instance, mobile phones have a different
environment and the construction of the EFG and EDG can
be completely different. In principle, there is no reason to 
believe that the grey-box approach is not applicable to other 
platforms. To generalize the approach to other platforms, 
we must first port the ripper and replayer tools. Further, 
the model implementations have to be adapted to the cor- 
responding environment. In this way, we believe that our 
approach can be generalized to different platforms. 

8. RELATED WORK 

Several approaches for modeling GUI-based applications 
have been developed for test case generation. 

Model-based GUI testing: Different models can be
used for event sequence generation [21, 25, 26]. For
example, AI planning techniques are used in [22]; covering arrays
in [21]. Event sequences are generated from these models 
and executed as test cases on the GUI to validate its behav- 
ior. In the grey-box approach, the EDG, created by ana- 
lyzing bytecode, is used to generate event sequences. In [H], 
symbolic execution is used to find adequate inputs for event 
sequences. While symbolic execution is a powerful technique 
to find precise input values, its applicability is limited due
to the complexity of the used algorithms. In contrast, the 
grey-box approach only tries to identify simple data depen- 
dencies without tracking the actual value of fields, and thus 
it is applicable for reasonably sized applications. In [19], a 
method to dynamically observe a program's behavior at ex- 
ecution time is presented. Instead of analyzing the source 
code, an analysis of the call stack at run-time is performed. 
Event sequences are then generated such that a minimum
set covers a maximum possible set of program execution
paths. In [17, 18], the AutoBlackTest approach is presented,
which constructs a GUI model by learning how the GUI 
interacts with the system functionalities. Then, the tool se- 
lects an executable and non-redundant test suite. They also 
compare with the GUITAR approach. However, we could 
not empirically compare with AutoBlackTest since it is not 
available at the moment. In [2^, feedback obtained by ex- 
ecuting an event sequence is used to generate an improved 
test suite. It is an iterative method where GUI run-time 
feedback is used instead of source code information. In [T^ 
the execution of a GUI-based application is represented as 
a sequence of events and output states. A state graph for 
the GUI is built which makes it possible to apply code based 
testing methods to GUIs. 

Byte, Binary and Source Code Analysis: Many tools 
are available for reachability analysis and state space explo- 
ration of programs using the byte, binary, or source code. 
For example, JavaPathFinder [11] works at the Java byte- 
code level to identify deadlocks, assertion violations and 
other properties of the program using heuristics for reduc- 
ing the state space explosion. Soot [15] is designed to be a 
framework to allow researchers to experiment with analyses 
and optimizations of Java bytecode. In the grey-box ap- 
proach, we are interested in detecting sequences of events, 
which eventually bring the system to a failure state. Hence, 
we decide to implement a light-weight bytecode analysis, 
which can be enhanced by the support of alternative tools. 

Search-based testing: In [T], a search-based testing
technique is proposed. Unit tests for Java classes and methods
are generated by looking for tests that satisfy given
heuristics. Another approach using search-based testing is
proposed in [6]. Heuristics are used to generate test cases that
violate automated test oracles. In the grey-box approach, a
data dependency can be seen as a heuristic, which helps to
sample the user-level model (EFG) more efficiently.

The grey-box approach is similar to the generation of
sequences of method calls, e.g., in libraries. However, when
system testing an application through its GUI, not all
methods (event handlers) may be available. For instance, a check
box is likely to have no separate event handler that toggles
its value between selected and deselected once a user clicks
it. This behavior may be implemented in the
GUI framework and does not exist in the application
itself. Without a user-level model it is difficult to generate a
proper event sequence if the value of the check box is
evaluated in a further event handler (method) within the
application, while it was changed in the GUI framework. Further,
providing precise input for data-bound widgets, e.g., for an
event handler that governs a text box, is tough.
Transferring input data to a text box during test case execution,
e.g., via reflection, may violate an invariant of the class, for
instance when the text box is disabled with regard to the
event flow and does not accept any input.

9. CONCLUSION AND FUTURE WORK 

We presented a new automatic grey-box approach for GUI 
event sequence generation. An EFG is generated automat- 
ically by observing the GUI at run-time (black-box). In 
addition, the application's bytecode is analyzed to find data
dependencies between event handlers (white-box) to generate
a model called an event-dependency graph (EDG). Abstract
event sequences representing data dependencies are first
generated from the EDG. These are then converted into
executable event sequences by looking up the EFG.

The grey-box approach incorporates 2 main steps: (i) 
model construction (EFG and EDG), and (ii) event sequence 
generation. The approach improves event sequence genera- 
tion by producing fewer test cases and avoids generating 
event sequences where consecutive events share no data de- 
pendencies. Empirical evaluation shows that the grey-box 
approach decreases the time to generate event sequences and 
the time for executing test cases while retaining coverage. 

Utilizing a black-box and a white-box model for the
generation of event sequences is promising. We plan to improve
both the creation of the models and the generation of event
sequences:

Model Creation: We plan to enhance the analysis of
event handlers and the computation of the weight between
two events. Table [4] shows the potential for
increasing the coverage of the AUTs. We believe that
analyzing conditionals, i.e., if-, switch-, and loop-statements, can
lead to the execution of more lines and branches. In the long
run, the grey-box approach is supposed to provide a
framework into which different black-box and white-box techniques
can be plugged to generate event sequences. For instance,
one would like to guide a dynamic symbolic execution [9]
based on the EFG, or wrap a GUI application in a set of
parameterized unit tests [?].

Event Sequence Generation: Typically, applications
contain a subset of events with a relatively high number of
dependent events, e.g., the system settings are read in many
other event handlers. The grey-box approach enables us to
identify these events, which we call hot spots. Intuitively, hot
spots may be fault prone owing to inter-procedural data
dependencies. In future work, we plan to specifically analyze
hot spots while generating event sequences. More precisely,
our event sequence generation uses the parameters len and
top while generating a set of executable test cases.
Considering hot spots might be useful to limit the event sequence
generation and test case execution to a specific timeout. The
idea is to spend a certain time on the testing of highly
dependent events.

Evaluating the fault detection effectiveness is an
important aspect of automated event sequence generation
techniques. As future work, we plan to evaluate the fault detection
effectiveness of sequences generated from the EDG. We will
start with fault-seeded versions of the applications, and then
consider naturally occurring faults in fielded applications. In order
to obtain strong evidence from the experiment, we plan to
execute the best configuration with different seeds.